UB at TREC Genomics 2006: Using Passage Retrieval and Pre-Retrieval Query Expansion for Genomics IR
Abstract
This paper presents the results of the University at Buffalo (UB) in the TREC 2006 Genomics track. For this task we used the SMART retrieval system and a pre-retrieval query expansion method based on the ABGene and MetaMap tools. We tried two weighting schemes: one using pivoted length normalization (Lnu.ltu) and another using augmented tf-idf (atn.ann). The results show that the performance of pivoted length normalization is very close to that of the median system that participated in the Genomics track. The augmented tf-idf scheme performs significantly above the median system, showing an improvement of 21%. This suggests that a simpler weighting scheme could work better for retrieving relevant passages.
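To make the two SMART labels more concrete, the sketch below computes term weights under their commonly described definitions: "atn" (augmented term frequency times idf, no normalization) for documents, "ann" (augmented term frequency only) for queries, and "Lnu" (log-average term frequency with pivoted unique normalization) as the document side of the other scheme. The abstract gives no formulas or parameter values, so the function names, the `slope` and `pivot` defaults, and the dot-product scoring are illustrative assumptions and not the UB configuration.

```python
import math
from collections import Counter


def atn_weights(doc_tokens, doc_freq, num_docs):
    """SMART 'atn' document weights: augmented tf * idf, no length normalization."""
    tf = Counter(doc_tokens)
    max_tf = max(tf.values())
    return {
        term: (0.5 + 0.5 * count / max_tf) * math.log(num_docs / doc_freq[term])
        for term, count in tf.items()
    }


def ann_weights(query_tokens):
    """SMART 'ann' query weights: augmented tf only (no idf, no normalization)."""
    tf = Counter(query_tokens)
    max_tf = max(tf.values())
    return {term: 0.5 + 0.5 * count / max_tf for term, count in tf.items()}


def lnu_weights(doc_tokens, slope=0.2, pivot=100.0):
    """SMART 'Lnu' document weights with pivoted unique normalization.

    slope and pivot are hypothetical values chosen for illustration only.
    """
    tf = Counter(doc_tokens)
    avg_tf = sum(tf.values()) / len(tf)
    # Pivoted normalization factor based on the number of unique terms.
    norm = (1.0 - slope) * pivot + slope * len(tf)
    return {
        term: ((1.0 + math.log(count)) / (1.0 + math.log(avg_tf))) / norm
        for term, count in tf.items()
    }


def score(doc_weights, query_weights):
    """Inner product of query and document weight vectors."""
    return sum(w * doc_weights.get(t, 0.0) for t, w in query_weights.items())
```

The atn scheme needs only per-document term counts and collection document frequencies, whereas Lnu also depends on a tuned slope and pivot; that difference is one sense in which atn.ann is the simpler of the two schemes.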
Similar resources
York University at TREC 2006: Genomics Track
Our Genomics experiments focus on four problems in biomedical information retrieval: (1) how to deal with synonyms, (2) how to deal with the frequent use of acronyms, (3) how to deal with homonyms, and (4) how to handle document-level, passage-level, and aspect-level retrieval. In particular, we use the automatic query expansion algo...
NTU at TREC 2006 Genomics Track
In this paper, we present a system for information retrieval of biomedical texts at passage level. Our system used KL-divergence as the underlying retrieval model. We further added query expansion and performed post-processing on the results. We were able to obtain a Document MAP of 0.3563, Passage MAP of 0.0464 and Aspect MAP of 0.2255 on one of the three runs.
A comparative analysis of retrieval features used in the TREC 2006 Genomics Track passage retrieval task
OBJECTIVE: Identify the set of features that best explained the variation in the performance measure of the TREC 2006 Genomics information extraction task, Mean Average Passage Precision (MAPP). METHODS: A multivariate regression model was built using a backward-elimination approach as a function of certain generalized features that were common to all the algorithms used by TREC 2006 Genomics track...
Combining Multiple Resources, Evidences and Criteria for Genomic Information Retrieval
We participated in the passage retrieval and aspect retrieval subtasks of the TREC 2006 Genomics Track. This paper describes the methods developed for these two subtasks. For passage retrieval, our query expansion method utilizes multiple external biomedical resources to extract acronyms, aliases, and synonyms, and we propose a post-processing step which combines the evidences from multiple sco...
Factors affecting the effectiveness of biomedical document indexing and retrieval based on terminologies
OBJECTIVE: The aim of this work is to evaluate a set of indexing and retrieval strategies based on the integration of several biomedical terminologies on the available TREC Genomics collections for an ad hoc information retrieval (IR) task. MATERIALS AND METHODS: We propose a multi-terminology based concept extraction approach for selecting the best concepts from free text by means of voting techniq...